AITopics | high dimensional data



Diffusion Curvature for Estimating Local Curvature in High Dimensional Data

Neural Information Processing Systems | Dec-24-2025, 17:12:49 GMT

We introduce a new intrinsic measure of local curvature on point-cloud data called diffusion curvature. Our measure uses the framework of diffusion maps, including the data diffusion operator, to structure point cloud data and define local curvature based on the laziness of a random walk starting at a point or region of the data. We show that this laziness directly relates to volume comparison results from Riemannian geometry. We then extend this scalar curvature notion to an entire quadratic form using neural network estimations based on the diffusion map of point-cloud data. We show applications of both estimations on toy data, single-cell data, and on estimating local Hessian matrices of neural network loss landscapes.
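The random-walk "laziness" mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the Gaussian kernel, the plain row normalisation, and the parameter names `sigma`, `t`, and `radius` are assumptions chosen for illustration, and the paper's exact definition and normalisation may differ. The function builds a Markov transition matrix from pairwise distances, takes `t` diffusion steps, and reports how much probability mass remains within `radius` of each starting point.

```python
import math


def diffusion_laziness(points, sigma=0.5, t=4, radius=0.6):
    """For each point, return the probability that a t-step random walk
    started there ends within `radius` of its starting point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Gaussian affinities, row-normalised into a Markov transition matrix P.
    kernel = [[math.exp(-((d / sigma) ** 2)) for d in row] for row in dist]
    P = [[k / sum(row) for k in row] for row in kernel]
    # Compute P^t by repeated multiplication (fine for small n).
    Pt = P
    for _ in range(t - 1):
        Pt = [
            [sum(Pt[i][k] * P[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    # Laziness: mass of the t-step walk that stays inside the local ball.
    return [
        sum(Pt[i][j] for j in range(n) if dist[i][j] <= radius)
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical toy point cloud: four corners of a unit square plus its centre.
    cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
    for point, value in zip(cloud, diffusion_laziness(cloud, radius=0.8)):
        print(point, round(value, 3))
```

The dense matrix products cost O(n^3) per step, so this only suits toy inputs; a practical version would use sparse or vectorised linear algebra.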

curvature, diffusion curvature, local curvature, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Cloud Computing (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)


Autoregressive Score Matching

Neural Information Processing Systems | Oct-2-2025, 20:37:58 GMT

Compared to previous score matching algorithms, our method is more scalable to high dimensional data and more stable to optimize.

artificial intelligence, csm, machine learning, (12 more...)

Neural Information Processing Systems

Country: North America (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Into the Void: Mapping the Unseen Gaps in High Dimensional Data

Zhang, Xinyu, Estro, Tyler, Kuenning, Geoff, Zadok, Erez, Mueller, Klaus

arXiv.org Artificial Intelligence | Jan-25-2025

We present a comprehensive pipeline, augmented by a visual analytics system named "GapMiner", that is aimed at exploring and exploiting untapped opportunities within the empty areas of high-dimensional datasets. Our approach begins with an initial dataset and then uses a novel Empty Space Search Algorithm (ESA) to identify the center points of these uncharted voids, which are regarded as reservoirs containing potentially valuable novel configurations. Initially, this process is guided by user interactions facilitated by GapMiner. GapMiner visualizes the Empty Space Configurations (ESC) identified by the search within the context of the data, enabling domain experts to explore and adjust ESCs using a linked parallel-coordinate display. These interactions enhance the dataset and contribute to the iterative training of a connected deep neural network (DNN). As the DNN trains, it gradually assumes the task of identifying high-potential ESCs, diminishing the need for direct user involvement. Ultimately, once the DNN achieves adequate accuracy, it autonomously guides the exploration of optimal configurations by predicting performance and refining configurations, using a combination of gradient ascent and improved empty-space searches. Domain users were actively engaged throughout the development of our system. Our findings demonstrate that our methodology consistently produces substantially superior novel configurations compared to conventional randomization-based methods. We illustrate the effectiveness of our method through several case studies addressing various objectives, including parameter optimization, adversarial learning, and reinforcement learning.

high dimensional data, machine learning, reinforcement learning, (4 more...)

arXiv.org Artificial Intelligence

2501.15273

Genre: Research Report > New Finding (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)


Diffusion Curvature for Estimating Local Curvature in High Dimensional Data

Neural Information Processing Systems | Jan-17-2025, 12:51:19 GMT

curvature, diffusion curvature, high dimensional data, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Cloud Computing (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.58)


Sparse Modelling for Feature Learning in High Dimensional Data

Neelam, Harish, Veerella, Koushik Sai, Biswas, Souradip

arXiv.org Artificial Intelligence | Sep-28-2024

This paper presents an innovative approach to dimensionality reduction and feature extraction in high-dimensional datasets, with a specific application focus on wood surface defect detection. The proposed framework integrates sparse modeling techniques, particularly Lasso and proximal gradient methods, into a comprehensive pipeline for efficient and interpretable feature selection. Leveraging pre-trained models such as VGG19 and incorporating anomaly detection methods like Isolation Forest and Local Outlier Factor, our methodology addresses the challenge of extracting meaningful features from complex datasets. Evaluation metrics such as accuracy and F1 score, alongside visualizations, are employed to assess the performance of the sparse modeling techniques. Through this work, we aim to advance the understanding and application of sparse modeling in machine learning, particularly in the context of wood surface defect detection.
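The abstract pairs Lasso with proximal gradient methods; a minimal sketch of that combination is iterative soft-thresholding (ISTA). This is an illustrative sketch under stated assumptions, not the paper's pipeline: the objective scaling (1/(2n)) * ||Xw - y||^2 + lam * ||w||_1, the function names, and the conservative step size are choices made here. It is written in plain Python so that it runs without third-party packages.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |.|: shrink value towards zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_ista(X, y, lam, n_iter=500):
    """Minimise (1/(2n)) * ||Xw - y||^2 + lam * ||w||_1 by proximal gradient."""
    n, d = len(X), len(X[0])
    # ||X||_F^2 / n upper-bounds the Lipschitz constant of the smooth term's
    # gradient, so its inverse is a safe (if conservative) step size.
    step = n / sum(x * x for row in X for x in row)
    w = [0.0] * d
    for _ in range(n_iter):
        residual = [
            sum(xj * wj for xj, wj in zip(row, w)) - yi for row, yi in zip(X, y)
        ]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(d)]
        # Gradient step on the squared loss, then the L1 proximal step.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad)]
    return w


if __name__ == "__main__":
    # Hypothetical toy data: the second feature contributes almost nothing.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    y = [2.0, 0.05, 2.0, 0.05]
    print(lasso_ista(X, y, lam=0.1))
```

The soft-thresholding step is what sets weak coefficients exactly to zero, which is why Lasso doubles as a feature selector.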

dataset, defect detection, detection, (10 more...)

arXiv.org Artificial Intelligence

2409.19361

Country:

Asia > India > Telangana > Hyderabad (0.06)
North America > United States > Michigan > Ingham County > Lansing (0.05)
North America > United States > Michigan > Ingham County > East Lansing (0.05)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Formation-Controlled Dimensionality Reduction

Jeong, Taeuk, Jung, Yoon Mo

arXiv.org Artificial Intelligence | Apr-10-2024

Dimensionality reduction is the process of extracting low dimensional structure from high dimensional data. Examples of high dimensional data include multimedia databases, gene expression microarrays, and financial time series. To deal with such real-world data properly, it is better to reduce its dimensionality and so avoid undesired properties of high dimensions such as the curse of dimensionality [14, 11]. As a result, classification, visualization, and compression of data can be expedited [14]. In many problems, it is presumed that the dimensionality of the measured data is only artificially high: the measured data are high-dimensional but nearly have a lower-dimensional structure, since they are multiple, indirect measurements of underlying factors, which typically cannot be directly calibrated [4].

dataset, formation control, neighbor point, (13 more...)

arXiv.org Artificial Intelligence

2404.06808

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.64)


ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning

Neural Information Processing Systems | Mar-14-2024, 23:39:17 GMT

Independent Components Analysis (ICA) and its variants have been successfully used for unsupervised feature learning. However, standard ICA requires an orthonormality constraint to be enforced, which makes it difficult to learn overcomplete features. In addition, ICA is sensitive to whitening. These properties make it challenging to scale ICA to high dimensional data. In this paper, we propose a robust soft reconstruction cost for ICA that allows us to learn highly overcomplete sparse features even on unwhitened data.
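One way to picture a soft reconstruction cost is to swap the hard orthonormality constraint for a penalty on how well W^T W x reconstructs x, added to a sparsity term on the features W x. The sketch below assumes that commonly cited form, with log-cosh as the smooth sparsity penalty and `lam` as the trade-off weight. The abstract does not spell these details out, so treat the exact objective and all names here as assumptions rather than the paper's definition.

```python
import math


def rica_cost(W, data, lam=1.0):
    """Average of lam * ||W^T W x - x||^2 + sum_j log cosh((W x)_j) over data.

    W is a list of k filter rows of length d; k may exceed d (overcomplete),
    because no orthonormality constraint is imposed on W.
    """
    total = 0.0
    for x in data:
        # Feature activations: code = W x.
        code = [sum(wj * xj for wj, xj in zip(row, x)) for row in W]
        # Linear reconstruction: recon = W^T code.
        recon = [
            sum(W[k][j] * code[k] for k in range(len(W))) for j in range(len(x))
        ]
        recon_error = sum((r - xj) ** 2 for r, xj in zip(recon, x))
        sparsity = sum(math.log(math.cosh(c)) for c in code)
        total += lam * recon_error + sparsity
    return total / len(data)


if __name__ == "__main__":
    # Hypothetical overcomplete filter bank: three filters for 2-D inputs.
    W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(rica_cost(W, [[1.0, 0.0], [0.0, 1.0]], lam=0.5))
```

With an orthonormal W the reconstruction term vanishes and only the sparsity term remains, which is the sense in which the penalty relaxes the constraint instead of discarding it.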

constraint, ica, reconstruction cost, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Bayesian Probabilistic Co-Subspace Addition

Neural Information Processing Systems | Mar-14-2024, 13:49:00 GMT

For modeling data matrices, this paper introduces the Probabilistic Co-Subspace Addition (PCSA) model, which simultaneously captures the dependent structures among both rows and columns. Briefly, PCSA assumes that each entry of a matrix is generated by the additive combination of the linear mappings of two low-dimensional features, which are distributed in the row-wise and column-wise latent subspaces respectively. Consequently, PCSA captures the dependencies among entries intricately, and is able to handle non-Gaussian and heteroscedastic densities. By formulating the posterior updating as the task of solving Sylvester equations, we propose an efficient variational inference algorithm. Furthermore, PCSA is extended to handling and filling missing values, to adapting model sparseness, and to modeling tensor data. In comparison with several state-of-the-art methods, experiments demonstrate the effectiveness and efficiency of Bayesian (sparse) PCSA on modeling matrix (tensor) data and filling missing values.

matrix, pcsa, tensor, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)


Sample Complexity of Testing the Manifold Hypothesis

Neural Information Processing Systems | Feb-16-2024, 10:20:17 GMT

The hypothesis that high dimensional data tends to lie in the vicinity of a low dimensional manifold is the basis of a collection of methodologies termed Manifold Learning. In this paper, we study statistical aspects of the question of fitting a manifold with a nearly optimal least squared error. Given upper bounds on the dimension, volume, and curvature, we show that Empirical Risk Minimization can produce a nearly optimal manifold using a number of random samples that is independent of the ambient dimension of the space in which data lie. We obtain an upper bound on the required number of samples that depends polynomially on the curvature, exponentially on the intrinsic dimension, and linearly on the intrinsic volume. For constant error, we prove a matching minimax lower bound on the sample complexity that shows that this dependence on intrinsic dimension, volume and curvature is unavoidable.

dimension, frac, manifold hypothesis, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.60)


Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data

Neural Information Processing Systems | Apr-6-2023, 16:07:39 GMT

In this paper we introduce a new underlying probabilistic model for principal component analysis (PCA). Our formulation interprets PCA as a particular Gaussian process prior on a mapping from a latent space to the observed data-space. We show that if the prior's covariance function constrains the mappings to be linear, the model is equivalent to PCA; we then extend the model by considering less restrictive covariance functions which allow non-linear mappings. This more general Gaussian process latent variable model (GPLVM) is then evaluated as an approach to the visualisation of high dimensional data for three different data-sets. Additionally, our non-linear algorithm can be further kernelised, leading to 'twin kernel PCA', in which a mapping between feature spaces occurs.
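The link between PCA and a Gaussian process prior can be written compactly. The notation below is assumed for illustration and does not come from the abstract: Y is the N x D data matrix, X the N x q matrix of latent points, and beta a noise precision. Under these assumptions, each data dimension is modelled as a draw from a zero-mean Gaussian process over the latent points, so the log-likelihood depends on X only through the kernel matrix K:

```latex
\log p(Y \mid X)
  = -\frac{DN}{2}\log 2\pi
    - \frac{D}{2}\log\lvert K\rvert
    - \frac{1}{2}\operatorname{tr}\!\left(K^{-1} Y Y^{\top}\right),
\qquad
K_{\text{linear}} = X X^{\top} + \beta^{-1} I .
```

Maximising this over X with the linear kernel corresponds to the PCA case described above. Replacing K with a non-linear covariance function, for example an RBF kernel, gives the non-linear mappings, typically at the cost of iterative optimisation.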

gaussian process latent variable model, high dimensional data, mapping, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
